Path: blob/master/Part 9 - Dimension Reduction/Principal Component Analysis/[Python] Principal Component Analysis.ipynb
Kernel: Python 3
Principal Component Analysis
Data preprocessing
In [1]:
In [2]:
Out[2]:
In [3]:
Out[3]:
In [4]:
In [5]:
Out[5]:
array([ 14.23, 1.71, 2.43, 15.6 , 127. , 2.8 ,
3.06, 0.28, 2.29, 5.64, 1.04, 3.92, 1065. ])
In [6]:
In [7]:
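A minimal sketch of the preprocessing steps above, assuming the UCI Wine data set is read from a file named Wine.csv with the class label in the last column; the file name, the 80/20 split and the random_state values are assumptions (a 20% test split is consistent with the 36-sample confusion matrix further below):

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load the Wine data set: 13 chemical measurements plus a class label (1, 2 or 3).
dataset = pd.read_csv('Wine.csv')          # file name is an assumption
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

X[0]  # first raw sample, as shown in Out[5] above

# Hold out a test set, then standardise the 13 features.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)  # split parameters are assumptions
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)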
Applying Principal Component Analysis
First, let's see the variance explained by the 13 independent variables. For this we create a PCA object with the n_components parameter set to None.
Then let's print the explained variance ratio.
This shows the principal components that explain the variance, in descending order.
Here the 1st principal component explains 36% of the variance, and the top two together explain 55% (i.e. 36% + 19%). Now we select the first 2 principal components.
In [8]:
In [9]:
Out[9]:
array([ 0.35900066, 0.18691934])
In [10]:
Out[10]:
array([ 2.06784347, -1.02818265])
In [11]:
Out[11]:
array([-1.16602698, -3.61532732])
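A sketch of the PCA step just described: fit once with n_components=None to inspect the explained variance ratio, then refit keeping only the first two components (variable names are assumptions):

from sklearn.decomposition import PCA

# Inspect how much variance each of the 13 principal components explains.
pca = PCA(n_components=None)
pca.fit(X_train)
pca.explained_variance_ratio_   # descending order; ~0.36 and ~0.19 for the first two

# Keep only the first two principal components for the classifier.
pca = PCA(n_components=2)
X_train = pca.fit_transform(X_train)
X_test = pca.transform(X_test)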
Fitting Logistic Regression to the Training Set
In [12]:
Out[12]:
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
penalty='l2', random_state=42, solver='liblinear', tol=0.0001,
verbose=0, warm_start=False)
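A sketch of the fit that produces the estimator shown in Out[12]; everything except random_state=42 is left at the scikit-learn defaults of that version:

from sklearn.linear_model import LogisticRegression

classifier = LogisticRegression(random_state=42)
classifier.fit(X_train, y_train)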
Predicting the Test set results
In [13]:
In [15]:
Out[15]:
array([1, 1, 3, 1, 2, 1, 2, 3, 2, 3])
In [16]:
Out[16]:
array([1, 1, 3, 1, 2, 1, 2, 3, 2, 3])
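A sketch of the prediction step; slicing the first ten entries only mirrors the two outputs shown above:

y_pred = classifier.predict(X_test)
y_pred[:10]   # predicted classes for the first ten test samples
y_test[:10]   # true classes for comparison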
Making the Confusion Matrix
In [17]:
Out[17]:
array([[14, 0, 0],
[ 0, 14, 0],
[ 0, 0, 8]])
Here we have no incorrect predictions at all: every value lies on the diagonal.
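A sketch of how the confusion matrix above can be obtained with scikit-learn:

from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_test, y_pred)   # rows: true class, columns: predicted class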
Accuracy
In [18]:
Out[18]:
1.0
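A sketch of the accuracy computation:

from sklearn.metrics import accuracy_score

accuracy_score(y_test, y_pred)   # 1.0 on this split: every test sample is classified correctly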
Visualizing the training set results
In [20]:
Out[20]:
<matplotlib.legend.Legend at 0x7f75a22c4a58>
Visualizing the test set results
In [21]:
Out[21]:
<matplotlib.legend.Legend at 0x7f75a1294780>
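A sketch of the decision-region plots behind the two legends shown above, in the style commonly used for classifiers on two features; the colour choices, axis labels and helper-function name are assumptions:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

def plot_decision_regions(X_set, y_set, title):
    # Classify a dense grid over the two principal components to colour the regions.
    X1, X2 = np.meshgrid(
        np.arange(X_set[:, 0].min() - 1, X_set[:, 0].max() + 1, 0.01),
        np.arange(X_set[:, 1].min() - 1, X_set[:, 1].max() + 1, 0.01))
    Z = classifier.predict(np.c_[X1.ravel(), X2.ravel()]).reshape(X1.shape)
    cmap = ListedColormap(('red', 'green', 'blue'))
    plt.contourf(X1, X2, Z, alpha=0.3, cmap=cmap)
    # Overlay the actual observations, one colour per class.
    for i, label in enumerate(np.unique(y_set)):
        plt.scatter(X_set[y_set == label, 0], X_set[y_set == label, 1],
                    color=cmap(i), label=label)
    plt.title(title)
    plt.xlabel('PC1')
    plt.ylabel('PC2')
    plt.legend()
    plt.show()

plot_decision_regions(X_train, y_train, 'Logistic Regression (Training set)')
plot_decision_regions(X_test, y_test, 'Logistic Regression (Test set)')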
In [ ]: